Machine Learning Based Keyphrase Extraction: Comparing Decision Trees, Naïve Bayes, and Artificial Neural Networks

Authors

  • Kamal Sarkar
  • Mita Nasipuri
  • Suranjan Ghose
Abstract

This paper presents three machine learning based keyphrase extraction methods that use Decision Trees, Naïve Bayes, and Artificial Neural Networks, respectively. We treat keyphrases as phrases of one or more words that represent the important concepts in a text document. The three methods are compared with one another and with KEA, a publicly available keyphrase extraction system. The experimental results show that the Neural Network based method outperforms the Decision Tree and Naïve Bayes based methods, and that it also performs better than KEA.

Keywords: Keyphrase Extraction, Decision Tree, Naïve Bayes, Artificial Neural Networks, Machine Learning, WEKA

Similar resources

A Comparison of Machine Learning Classifiers Applied to Financial Datasets

The main purpose of this project is to analyze several Machine Learning techniques individually and compare the efficiency and classification accuracy of those techniques. Three algorithms are used (Naïve Bayes learning, feed forward Artificial Neural Networks with Backpropagation, and Decision Trees learning using C4.5) over two datasets (“European companies” and “Japanese companies”...

Knowledge Based Analysis of Various Statistical Tools in Detecting Breast Cancer

In this paper, we study the performance criterion of machine learning tools in classifying breast cancer. We compare data mining tools such as Naïve Bayes, support vector machines, radial basis neural networks, J48 decision trees, and simple CART. We used both binary and multi-class data sets, namely WBC, WDBC, and Breast Tissue, from the UCI Machine Learning Repository. The experiments are conduct...

An Empirical Comparison of Supervised Learning Algorithms in Disease Detection

In this paper, an empirical comparison of various supervised algorithms is carried out. We studied the performance criterion of machine learning tools such as Naïve Bayes, support vector machines, radial basis neural networks, J48 decision trees, and simple CART in detecting diseases. We used both binary and multi-class data sets, namely WBC, WDBC, Pima Indians Diabetes database and Breast tiss...

Calculating classifier calibration performance with a custom modification of Weka

Calibration is often overlooked in machine-learning problem-solving approaches, even in situations where an accurate estimation of predicted probabilities, and not only a discrimination between classes, is critical for decision-making. One of the reasons is the lack of readily available open-source software packages which can easily calculate calibration metrics. In order to provide one such to...

Image Classification Using Naïve Bayes Classifier

An image classification scheme using a Naïve Bayes Classifier is proposed in this paper. The proposed Naïve Bayes Classifier-based image classifier can be considered as the maximum a posteriori decision rule. The Naïve Bayes Classifier can produce very accurate classification results with a minimum training time when compared to conventional supervised or unsupervised learning algorithms. Compreh...

Journal title:
  • JIPS

Volume 8  Issue

Pages  -

Publication date 2012